Exploration via Model-based Interval Estimation
Abstract
This paper takes an empirical approach to evaluating three model-based reinforcement-learning methods. All methods intend to speed the learning process by mixing exploitation of learned knowledge with exploration of possibly promising alternatives. We consider ε-greedy exploration, which is computationally cheap and popular, but unfocused in its exploration effort; R-Max exploration, a simplification of an exploration scheme that comes with a theoretical guarantee of efficiency; and a well-grounded approach, model-based interval estimation, that better integrates exploration and exploitation and achieves the best performance in our example tasks. Our experiments indicate that effective exploration can result in dramatic improvements in the observed rate of learning.
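As a rough illustration of why ε-greedy exploration is described as cheap but unfocused, the sketch below shows a generic ε-greedy action choice. It is not taken from the paper: the function name `epsilon_greedy`, the list `q_values` of estimated action values, and the parameter `epsilon` are all assumptions made for this example.

```python
import random


def epsilon_greedy(q_values, epsilon, rng=random):
    """Return an action index (hypothetical helper, not the paper's code).

    With probability `epsilon` the action is drawn uniformly at random
    (exploration); otherwise the action with the highest estimated value
    is chosen (exploitation).
    """
    if rng.random() < epsilon:
        # Explore: every action is equally likely, regardless of how much
        # is already known about it. This is the "unfocused" part.
        return rng.randrange(len(q_values))
    # Exploit: pick the highest estimate (the first one wins on ties).
    return max(range(len(q_values)), key=lambda a: q_values[a])
```

Each choice costs only one random draw and one pass over the action values, but the exploratory draw ignores how uncertain each estimate is. Interval-estimation approaches, by contrast, direct exploration toward actions whose values are still poorly known.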
Similar resources
Interval Estimation for the Exponential Distribution under Progressive Type-II Censored Step-Stress Accelerated Life-Testing Model Based on Fisher Information
This paper determines the confidence interval using the Fisher information under progressive type-II censoring for the k-step exponential step-stress accelerated life testing. We study the performance of these confidence intervals. Finally, an example is given to illustrate the proposed procedures.
A Theoretical Analysis of Model-Based Interval Estimation: Proofs
Several algorithms for learning near-optimal policies in Markov Decision Processes have been analyzed and proven efficient. Empirical results have suggested that Model-based Interval Estimation (MBIE) learns efficiently in practice, effectively balancing exploration and exploitation. This paper presents the first theoretical analysis of MBIE, proving its efficiency even under worst-case conditi...
A confidence-aware interval-based trust model
It is a common and useful task in a web of trust to evaluate the trust value between two nodes using intermediate nodes. This technique is widely used when the source node has no experience of direct interaction with the target node, or the direct trust is not reliable enough by itself. If trust is used to support decision-making, it is important to have not only an accurate estimate of trust, ...
Bayes Interval Estimation on the Parameters of the Weibull Distribution for Complete and Censored Tests
A method for constructing confidence intervals on parameters of a continuous probability distribution is developed in this paper. The objective is to present a model for an uncertainty represented by parameters of a probability density function. As an application, confidence intervals for the two parameters of the Weibull distribution along with their joint confidence interval are derived. The...
Estimation of the mean grain size of mechanically induced Hydroxyapatite based bioceramics via artificial neural network
This study focuses on the estimation of the mean grain size of mechanically induced Hydroxyapatite (HA) through the artificial neural network (ANN) model. The mean grain size of HA and HA based nanocomposites at different milling parameters were obtained from previous studies. The data were trained and tested by the neural network modeling. Accordingly, all data (55 sets) were based on the mecha...